This readme file explains the structure of the replication directory.

Please direct any questions, comments, or spotted errors to: 
marko.klasnja@gmail.com

---------------------------------------------------------------------
Data
---------------------------------------------------------------------

1) electoral_data.dta = main analysis file, containing election results
	for the three election triplets, as well as pretreatment 
	covariates used for balance testing. For party codes, see
	party_codes.txt in this directory. 

2) wealth_data.dta = the dataset used for results on wealth accumulation.
	The data are drawn from the publicly released declarations of
	assets at the designated web portal of the Romanian Integrity
	Agency: http://declaratii.integritate.eu/home/navigare/cautare-avansata.aspx
	The data were collected by research assistants who coded each
	declaration manually. The dataset contains links for most
	declarations used. Unfortunately, not all observations
	include the link: in several initial coding rounds, unlike
	in later ones, I did not instruct my research assistants
	to record the declaration link. Apologies for these data
	gaps.

3) corruption_data.dta = the dataset containing the measures of corruption
	used in the analysis. The directory contains the primary
	data from which the measures are created, and a do-file
	(Create_corruption_measures.do, described below) that lays
	out the steps used to create them:

	a) _data_gov_contracts.dta contains procurement contract 
	variables from http://data.gov.ro/dataset?groups=achizitii-publice,
	the official government repository of procurement data (last
	downloaded in February 2014; the data are continually updated
	and cleaned). These data are used to construct the "opaque
	procedure" corruption measure and the "single bidder"
	corruption measure.

	b) _pp_direct_acquisitions.dta contains data on direct 
	acquisition contracts, as scraped from the HTML data provided
	by a private provider, http://www.tender-service.com/. These
	data are used to construct the "price per quantity" measure.

	c) _infrastructure_fiscal_data.dta contains data on changes
	in infrastructure as well as data on local revenues and 
	expenditures and control variables used to create a measure
	of "missing infrastructure." The data come from several sources,
	including the Ministry of Finance, Ministry of Regional Development,
	Romanian National Institute of Statistics, and Expert Forum. 

4) survey_data.dta = the dataset used to test the assumption behind 
	prediction 2 in the paper. The data combine 10 surveys:
	the Romanian Opinion Barometer (May 2002, October 2002, May 2003,
	October 2003, May 2004, October 2004, May 2005, October 2006, 
	October 2007), and the Romanian Electoral Studies November 2009
	survey. For more details, see: 
	http://www.fundatia.ro/en/public-opinion-barometer
	https://resproject.wordpress.com/


5) pop_bw_data.dta = the dataset containing a number of predetermined
	variables used to construct the balanced population window,
	described in detail in Section A6 of the Supplementary
	Appendix.

6) 2008_salary_data.dta = the dataset containing the wealth information
	from the 2008 asset declarations of individuals who were 
	mayors at the time of the 2008 election. These data are used
	to verify that the 7,000 population/salary threshold used
	for identification in the paper actually does produce a 
	jump in the mayoral salary at the threshold. 

7) census_pop_data.dta = the locality population counts based on the
	2002 and 2011 census data. This dataset is used to test 
	whether the census counts are manipulated by mayors. The 
	results are reported in Table A9 in the Supplementary
	Appendix. 

8) prosecution_data.dta = the dataset containing summary measures
	of corruption prosecutions by the Romanian Anti-Corruption
	Directorate (DNA) and the Romanian National Integrity
	Agency (ANI). The data are discussed in the Supplementary
	Appendix, and the results are reported in Table A14. The
	data were coded by research assistants from each agency's
	press releases:

	http://www.pna.ro/comunicate.xhtml
	https://www.integritate.eu/Media/Comunicate-de-pres%C4%83.aspx


---------------------------------------------------------------------
Do-files
---------------------------------------------------------------------

Please note the comments at the beginning of each do-file for important
	information.

9) Main_text_replication.do = the do-file that replicates the results
	reported in the main body of the paper. 

10) Supp_Appendix_replication.do = the do-file that replicates the 
	results reported in the Supplementary Appendix. 

11) Create_corruption_measures.do = the do-file that creates the
	corruption measures from the datasets (3a)-(3c), and merges them
	into a single dataset, corruption_data.dta, which is placed in the
	same directory. 
